Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure no redundant rcl_logging initialization and finalization. #560

Conversation

fujitatomoya
Copy link
Collaborator

this is fix for #551.

the following, i would like to hear from others.

  • Does it have to introduce thread safe? once mutex is introduced, there would be pain for maintenance and possibilities to increase deadlock. (which we do not really like...)
  • it seems to be probable we can avoid multiple call for rcl_logging init/init without thread safe. and even if that happens, there would be no such problem for user application.

Copy link
Member

@ivanpauno ivanpauno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments.

Besides that, rcl_logging_init is actually called more than once when multiple context are created.
How this new return value will be handled there?

Actually, as I commented in the issue, we should ensure that the logger is not finished until the last context is finished. So we should probably do some kind of "reference" counting, instead of just a boolean indicating if it was already init or not.

rcl/src/rcl/logging.c Outdated Show resolved Hide resolved
rcl/src/rcl/logging.c Outdated Show resolved Hide resolved
rcl/src/rcl/logging.c Outdated Show resolved Hide resolved
Signed-off-by: Tomoya.Fujita <Tomoya.Fujita@sony.com>
@@ -77,6 +81,7 @@ rcl_logging_configure(const rcl_arguments_t * global_args, const rcl_allocator_t
g_rcl_logging_ext_lib_enabled = !global_args->impl->log_ext_lib_disabled;
rcl_ret_t status = RCL_RET_OK;
g_rcl_logging_num_out_handlers = 0;
(void) rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, 1);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if multiple contexts exist, this will be called more than once. so that there should be no warning.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the return value should be used.
Like,

if(rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, 1)) {
  return RMW_RET_OK;  // already inited, just return.
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think this should be the first thing done in the function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are not sure rcl_logging_init is actually done without any error, what if argument is missing?
counter == 1 does only mean rcl_logging_init is called, but rcl_logging_init is succeeded.

@@ -107,6 +112,12 @@ rcl_logging_configure(const rcl_arguments_t * global_args, const rcl_allocator_t
rcl_ret_t rcl_logging_fini()
{
rcl_ret_t status = RCL_RET_OK;
uint64_t current_count = rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, -1);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using atomic is not exactly thread safe, but expecting good enough for this case. besides we do not really want to introduce mutex in here.

rcl/src/rcl/logging.c Outdated Show resolved Hide resolved
@fujitatomoya
Copy link
Collaborator Author

@ivanpauno

thanks for enlighten me, could you check once again?

@ivanpauno
Copy link
Member

using atomic is not exactly thread safe, but expecting good enough for this case. besides we do not really want to introduce mutex in here.

I think that in this case (doing an atomic fetch add followed by a guard that ensures that the following code is executed only under an specific previous value of the atomic) should be thread safe, right?

@@ -77,6 +81,7 @@ rcl_logging_configure(const rcl_arguments_t * global_args, const rcl_allocator_t
g_rcl_logging_ext_lib_enabled = !global_args->impl->log_ext_lib_disabled;
rcl_ret_t status = RCL_RET_OK;
g_rcl_logging_num_out_handlers = 0;
(void) rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the return value should be used.
Like,

if(rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, 1)) {
  return RMW_RET_OK;  // already inited, just return.
}

@@ -77,6 +81,7 @@ rcl_logging_configure(const rcl_arguments_t * global_args, const rcl_allocator_t
g_rcl_logging_ext_lib_enabled = !global_args->impl->log_ext_lib_disabled;
rcl_ret_t status = RCL_RET_OK;
g_rcl_logging_num_out_handlers = 0;
(void) rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think this should be the first thing done in the function.

@@ -107,6 +112,12 @@ rcl_logging_configure(const rcl_arguments_t * global_args, const rcl_allocator_t
rcl_ret_t rcl_logging_fini()
{
rcl_ret_t status = RCL_RET_OK;
uint64_t current_count = rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, -1);

if (current_count != 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be != 0? The returned value by fetch add is the previous value of the atomic.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, current_count is fetched before addition(subtraction), confirmed this with simple test program.
current_count == 1 means that initialized once, so that needs to be finalized.
current_count != 1 means that initialized multiple times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry you're right. But current_count sounds confusing, maybe previous_count?

…ucceed.

Signed-off-by: Tomoya.Fujita <Tomoya.Fujita@sony.com>
@fujitatomoya
Copy link
Collaborator Author

@ivanpauno

i did reconsider the code with expectation with rcl logging init/fini always succeed.
i believe that this should be good enough, but if you want to have 100% exclusive including rcl_logging_init/fini failure cases, i need either rwlock or mutext to keep the other threads wait.

could you share your thoughts?

@@ -71,7 +71,9 @@ rcl_logging_init(const rcl_arguments_t * global_args, const rcl_allocator_t * al
{
RCL_CHECK_ARGUMENT_FOR_NULL(global_args, RCL_RET_INVALID_ARGUMENT);
RCL_CHECK_ARGUMENT_FOR_NULL(allocator, RCL_RET_INVALID_ARGUMENT);

if (rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, 1)) {
return RCL_RET_OK;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should log something here, to not return silently.
Maybe, an user might expect to be able to configure a logging of two contexts differently.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, that is one of the cases, let's add it to notify the user.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but instead I'd prefer for this to fail instead (return an error code). That way the caller can decide how to react to this. As-is the caller cannot tell if the function succeeded or "succeeded" but didn't use the given settings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I think the better solution would look like this (from the caller side):

Context::Context(...)
{
  // ...
  std::lock_guard<std::mutex> lock(g_logging_mutex);  // ensures thread-safety
  if (rcl_logging_is_initialized()) {
    auto ret = rcl_logging_init(...);
    if (ret != RCL_RET_OK) {
      // error handling
    }
  } else {
    auto ret = rcl_logging_increase_ref_count();  // safe because of lock above
    if (ret != RCL_RET_OK) {
      // error handling
    }
  }
  // ...
}

Context::~Context()
{
  // ...
  std::lock_guard<std::mutex> lock(g_logging_mutex);  // ensures thread-safety
  auto ret = rcl_logging_decrease_ref_count();  // will implicitly fini() if ref_count is zero
  if (ret != RCL_RET_OK) {
    // error handling
  }
  // ...
}

So that it is more explicit what's happening.

Or in the case where something is using logging but doesn't have what is required to initialize it (requires something else initialized it first):

{
  // ...
  std::lock_guard<std::mutex> lock(g_logging_mutex);  // ensures thread-safety
  if (rcl_logging_increase_ref_count() == RCL_LOGGING_NOT_INITIALIZED) {
    // error handling because it was unexpectedly not initialized...
  }
  // ...
}

Not sure there's a use case for the second example, but either way I think having init return OK but not actually consider the input arguments is probably not good. But at the same time you need to see if it is initialized and increase the reference count some how.


Also, out of scope for this pr, but logging should probably not require the use of global... This is only a problem because we're mapping non-globals (multiple context) to the initialization of a global. If each context could create its own logging instance this would not be a problem.

@ivanpauno
Copy link
Member

i did reconsider the code with expectation with rcl logging init/fini always succeed.

I think that's fine, I'm not sure if we should log something or in some cases indicate an error.
@hidmic @wjwwood for more feedback.

i believe that this should be good enough, but if you want to have 100% exclusive including rcl_logging_init/fini failure cases, i need either rwlock or mutext to keep the other threads wait.

I think there is no need of an extra rwlock/mutex, but a second reviewer may think different.

@ivanpauno ivanpauno requested a review from hidmic January 28, 2020 12:58
@ivanpauno ivanpauno added the in review Waiting for review (Kanban column) label Jan 28, 2020
{
RCL_CHECK_ARGUMENT_FOR_NULL(global_args, RCL_RET_INVALID_ARGUMENT);
RCL_CHECK_ARGUMENT_FOR_NULL(allocator, RCL_RET_INVALID_ARGUMENT);
if (rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, 1)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we counting and not just using a flag? This function is marked as not thread-safe, and rcl_logging_fini as well. So we don't have to handle the case where this is called multiple times concurrently. It's up to the code above this to ensure it isn't called in a thread unsafe way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see why a counter, but why atomic?

rcl/src/rcl/logging.c Outdated Show resolved Hide resolved
@ivanpauno
Copy link
Member

@wjwwood I suggested doing reference counting, because the creation/destruction of each context calls rcl_logging_configure and rcl_logging_fini:

ret = rcl_logging_configure(&context->global_arguments, &allocator);

rcl_ret_t rcl_ret = rcl_logging_fini();

If in rclcpp you do:

rclcpp::init(argc, argv);
...
{
  auto context1 = std::make_shared<rclcpp::Context >();
  context1->init(0, nullptr);
}
RCLCPP_INFO(...); // Logging was finished when the above context shutdown.
...

I think that's a problem.

rcl/src/rcl/logging.c Outdated Show resolved Hide resolved
rcl/src/rcl/logging.c Outdated Show resolved Hide resolved
{
RCL_CHECK_ARGUMENT_FOR_NULL(global_args, RCL_RET_INVALID_ARGUMENT);
RCL_CHECK_ARGUMENT_FOR_NULL(allocator, RCL_RET_INVALID_ARGUMENT);
if (rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, 1)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see why a counter, but why atomic?

@@ -71,7 +71,9 @@ rcl_logging_init(const rcl_arguments_t * global_args, const rcl_allocator_t * al
{
RCL_CHECK_ARGUMENT_FOR_NULL(global_args, RCL_RET_INVALID_ARGUMENT);
RCL_CHECK_ARGUMENT_FOR_NULL(allocator, RCL_RET_INVALID_ARGUMENT);

if (rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, 1)) {
return RCL_RET_OK;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but instead I'd prefer for this to fail instead (return an error code). That way the caller can decide how to react to this. As-is the caller cannot tell if the function succeeded or "succeeded" but didn't use the given settings.

@@ -71,7 +71,9 @@ rcl_logging_init(const rcl_arguments_t * global_args, const rcl_allocator_t * al
{
RCL_CHECK_ARGUMENT_FOR_NULL(global_args, RCL_RET_INVALID_ARGUMENT);
RCL_CHECK_ARGUMENT_FOR_NULL(allocator, RCL_RET_INVALID_ARGUMENT);

if (rcutils_atomic_fetch_add_uint64_t(&__rcl_logging_init_count, 1)) {
return RCL_RET_OK;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I think the better solution would look like this (from the caller side):

Context::Context(...)
{
  // ...
  std::lock_guard<std::mutex> lock(g_logging_mutex);  // ensures thread-safety
  if (rcl_logging_is_initialized()) {
    auto ret = rcl_logging_init(...);
    if (ret != RCL_RET_OK) {
      // error handling
    }
  } else {
    auto ret = rcl_logging_increase_ref_count();  // safe because of lock above
    if (ret != RCL_RET_OK) {
      // error handling
    }
  }
  // ...
}

Context::~Context()
{
  // ...
  std::lock_guard<std::mutex> lock(g_logging_mutex);  // ensures thread-safety
  auto ret = rcl_logging_decrease_ref_count();  // will implicitly fini() if ref_count is zero
  if (ret != RCL_RET_OK) {
    // error handling
  }
  // ...
}

So that it is more explicit what's happening.

Or in the case where something is using logging but doesn't have what is required to initialize it (requires something else initialized it first):

{
  // ...
  std::lock_guard<std::mutex> lock(g_logging_mutex);  // ensures thread-safety
  if (rcl_logging_increase_ref_count() == RCL_LOGGING_NOT_INITIALIZED) {
    // error handling because it was unexpectedly not initialized...
  }
  // ...
}

Not sure there's a use case for the second example, but either way I think having init return OK but not actually consider the input arguments is probably not good. But at the same time you need to see if it is initialized and increase the reference count some how.


Also, out of scope for this pr, but logging should probably not require the use of global... This is only a problem because we're mapping non-globals (multiple context) to the initialization of a global. If each context could create its own logging instance this would not be a problem.

rcl/src/rcl/logging.c Outdated Show resolved Hide resolved
@ivanpauno
Copy link
Member

Also, out of scope for this pr, but logging should probably not require the use of global... This is only a problem because we're mapping non-globals (multiple context) to the initialization of a global. If each context could create its own logging instance this would not be a problem.

I agree, but that would modify completly the logging API. e.g:

RCLCPP_INFO(context.get_logger(), ...);

And we wouldn't necessary want a logger per Context, so having an independent object for it would be reasonable. Something like having a singleton logger that's used automatically, and you can create a new one manually. Logging initialization should probably be decoupled of context initialization.

The question is: who is responsible of configuring the logger?
Currently, each context can "configure" logging, but it's kind of not working as the same globals are shared between different contexts.
I think that doing reference counting is a clever way of fixing the problem without modifying the API, but we should probably reconsider the API later.


About the above solution proposed, I think that solving the problem in an upper layer is fine, though we will have to solve the problem independently in each client.
My idea was to solve it in rcl, to avoid duplication. I think there are two ways of solving the problem:

  • rcl_logging_configure and rcl_logging_fini do reference counting internally.
  • The usual caller of rcl_logging_configure and rcl_logging_fini do reference counting, avoiding to call them more than once (the usual caller is rcl_init and rcl_shutdown).

I think that the second option is probable more reasonable, as the caller should be responsible of not initing the logger more than once.
If we solve the problem in rcl, the reference counting should be thread-safe, if not, the upper layer needs a global logging lock (that's fine too, but IMO it's a bit confusing).

So, my idea is something like this:

atomic_uint_least64_t g_logging_ref_count = ATOMIC_VAR_INIT(0);

rcl_init(...)
{
  ...
  if (0u == rcutils_atomic_fetch_add_uint64_t(&g_logging_ref_count, 1)) {
     rcl_logging_configure(...);
  } else {
    // log a warn if logging arguments were passed again.
  }
  ...
} 

rcl_shutdown(...)
{
  ...
  if (1u == rcutils_atomic_fetch_add_uint64_t(&g_logging_ref_count, -1)) {
     rcl_logging_fini(...);
  }
  ...
}

@wjwwood Is that a reasonable solution?

@wjwwood
Copy link
Member

wjwwood commented Jan 31, 2020

My idea was to solve it in rcl, to avoid duplication.

That's fine, but...

If we solve the problem in rcl, the reference counting should be thread-safe...

I don't think that can be done safely without a mutex, and rcl should not ever use OS primitives like a mutex, which is part of the design goal of rcl to not do threading, and other OS related tasks (see: https://docs.ros2.org/eloquent/developer_overview.html#internal-api-architecture-overview).

Using just atomics will not be sufficient in my opinion, or at least I haven't seen a patch that achieves that.

If you think it can be done, then that's fine I'll review a patch for that, but then the documentation needs to be updated to indicate that it is thread-safe.


the reference counting should be thread-safe, if not, the upper layer needs a global logging lock (that's fine too, but IMO it's a bit confusing).

My opinion is that this will be required, regardless of whether or not it is confusing, though I don't see why it is confusing to be honest.

@hidmic
Copy link
Contributor

hidmic commented Jan 31, 2020

Using just atomics will not be sufficient in my opinion, or at least I haven't seen a patch that achieves that.

I don't see why an atomic counter would not ensure thread-safety in this case, though it's arguable whether we should be using them at all, along with other global state, instead of pushing loggers inside contexts. But then the question becomes, does it make sense to have a logger per context? If not, why are contexts initializing logging in the first place? Should logging initialization be done once per process, like signal handlers?

@wjwwood
Copy link
Member

wjwwood commented Jan 31, 2020

I don't see why an atomic counter would not ensure thread-safety in this case,

Well, if you're mutating global state then fini and init cannot be called concurrently. I don't see how the atomic counter can protect you. We can review and discuss a concrete pull request more easily. Do you guys think the current patch is thread-safe? If so I'll re-review it with that in mind, though the doc strings still claim that they are not thread-safe.

But then the question becomes, does it make sense to have a logger per context?

The implications are annoying, but I think it makes sense. Alternatively, perhaps the context should not initialize the logging, as you suggest.

If not, why are contexts initializing logging in the first place?

As to "why", init (rcl_init) used to be global, and it asked the user to pass the argv/argc too. When it was refactored to allow for non-global init, the argv/argc were kept, so they could be used (optionally) for standard ROS command line stuff, even for non-global context usage. Therefore I guess it made sense to initialize logging in the context since initializing logging also needed argv/argc. However, I don't think it's required for that to be the case, i.e. I think logging could be initialized separately from the context if needed.

Should logging initialization be done once per process, like signal handlers?

Maybe. The signal handlers are initialized only if the global rclcpp::init() is used, otherwise it is up to the user to "install the signal handler". So it would be similar, if the user wanted to avoid globals with the context (rclcpp::init()), then they would need to initialize logging manually.

If having library globals is an issue for the user (e.g. due to destruction order issues) then they would just have to forego using our signal handlers and make their own and avoid using the global versions of our functions. For logging this isn't as easy for the user to do since logging calls are made through out the code, so they cannot easily avoid them all.

Also, in my opinion, comparing logging to signal handling is kind of flawed, because the logger could technically be used without globals (think about our deliberations over log4cxx vs spdlog) and ideally our logging API would allow for non-global usage. Conversely, I do not believe the signal handling is technically feasible without using globals.

While important, I think this is well out of scope for this pull request. For this pull request we just need to decide how to make it thread-safe, either in rcl or rclcpp depending on how the atomic issue is resolved.

@hidmic
Copy link
Contributor

hidmic commented Jan 31, 2020

because the logger could technically be used without globals (think about our deliberations over log4cxx vs spdlog) and ideally our logging API would allow for non-global usage. Conversely, I do not believe the signal handling is technically feasible without using globals.

For sure. I meant "...to be initialized on a per process basis". But that doesn't have to be the case really. Logging could be initialized separately, against either global or non-global storage, and made available globally or associated explicitly with contexts or nodes, or simply passed around. Not all of those combinations are sane though.

For this pull request we just need to decide how to make it thread-safe, either in rcl or rclcpp depending on how the atomic issue is resolved.

Fully agreed.

@fujitatomoya
Copy link
Collaborator Author

@wjwwood

Do you guys think the current patch is thread-safe?

No, atomic counter is atomic access, it cannot take care of the failure cases. for supporting failure cases, we surely will need mutex or rwlock to keep the everyone else exclusive in specific time window.

@fujitatomoya
Copy link
Collaborator Author

fujitatomoya commented Feb 3, 2020

@ivanpauno @wjwwood @hidmic

How you guys really want to proceed? Taking care of thread-safe via rclcpp?

IMO, i believe that current implementation is reasonable at certain level but expecting logging init and fini never fails.

Signed-off-by: Tomoya.Fujita <Tomoya.Fujita@sony.com>
@ivanpauno
Copy link
Member

Well, if you're mutating global state then fini and init cannot be called concurrently. I don't see how the atomic counter can protect you. We can review and discuss a concrete pull request more easily. Do you guys think the current patch is thread-safe? If so I'll re-review it with that in mind, though the doc strings still claim that they are not thread-safe.

I proposed an altarnative in #573, just to clarify my idea with code.
I think it's not a definitive solution, rcl_init should probably not be responsible of initializing the loggers. But it's just an easy way to avoid the double initialization/early finishing, and warn the user when they tried to configure the logger more than once.

@ivanpauno
Copy link
Member

I've merged the alternative PRs, sorry for the delay.
Thanks for having work on this @fujitatomoya !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in review Waiting for review (Kanban column)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants